
    TCBERT: A Technical Report for Chinese Topic Classification BERT

    Bidirectional Encoder Representations from Transformers, or BERT~\cite{devlin-etal-2019-bert}, has been one of the base models for various NLP tasks due to its remarkable performance. Variants customized for different languages and tasks have been proposed to further improve performance. In this work, we investigate supervised continued pre-training~\cite{gururangan-etal-2020-dont} on BERT for the Chinese topic classification task. Specifically, we incorporate prompt-based learning and contrastive learning into the pre-training. To adapt to the task of Chinese topic classification, we collect around 2.1M Chinese data samples spanning various topics. The pre-trained Chinese Topic Classification BERTs (TCBERTs) with different parameter sizes are open-sourced at \url{https://huggingface.co/IDEA-CCNL}.

    Ziya-Visual: Bilingual Large Vision-Language Model via Multi-Task Instruction Tuning

    Recent advancements have expanded the capabilities of large language models (LLMs) in zero-shot image-to-text generation and understanding by integrating multi-modal inputs. However, such success is typically limited to English scenarios due to the lack of large-scale, high-quality non-English multi-modal resources, making it extremely difficult to establish competitive counterparts in other languages. In this paper, we introduce the Ziya-Visual series, a set of bilingual large-scale vision-language models (LVLMs) designed to incorporate visual semantics into LLMs for multi-modal dialogue. Composed of Ziya-Visual-Base and Ziya-Visual-Chat, our models adopt the Querying Transformer from BLIP-2 and further explore optimization schemes such as instruction tuning, multi-stage training, and a low-rank adaptation module for visual-language alignment. In addition, we elicit the understanding ability of GPT-4 in multi-modal scenarios, translating our gathered English image-text datasets into Chinese and generating instruction-response pairs through in-context learning. The experimental results demonstrate that, compared to existing LVLMs, Ziya-Visual achieves competitive performance across a wide range of English-only tasks, including zero-shot image-text retrieval, image captioning, and visual question answering. The evaluation leaderboard assessed by GPT-4 also indicates that our models possess satisfactory image-text understanding and generation capabilities in Chinese multi-modal dialogue scenarios. Code, demo and models are available at \url{https://huggingface.co/IDEA-CCNL/Ziya-BLIP2-14B-Visual-v1}.

    Orca: A Few-shot Benchmark for Chinese Conversational Machine Reading Comprehension

    The conversational machine reading comprehension (CMRC) task aims to answer questions in conversations and has been a hot research topic in recent years because of its wide applications. However, existing CMRC benchmarks, in which each conversation is assigned a static passage, are inconsistent with real scenarios. Thus, a model's comprehension ability in real scenarios is hard to evaluate reasonably. To this end, we propose Orca, the first Chinese CMRC benchmark, and further provide zero-shot and few-shot settings to evaluate a model's generalization ability across diverse domains. We collect 831 hot-topic-driven conversations with 4,742 turns in total. Each turn of a conversation is assigned a response-related passage, aiming to evaluate a model's comprehension ability more reasonably. The conversation topics are collected from a social media platform and cover 33 domains, in an effort to be consistent with real scenarios. Importantly, answers in Orca are all well-annotated natural responses rather than the specific spans or short phrases used in previous datasets. In addition, we implement three strong baselines to tackle the challenge in Orca. The results indicate the great challenge posed by our CMRC benchmark. Our dataset and checkpoints are available at https://github.com/nuochenpku/Orca. Comment: 14 pages

    Potential of Core-Collapse Supernova Neutrino Detection at JUNO

    JUNO is an underground neutrino observatory under construction in Jiangmen, China. It uses 20 kton of liquid scintillator as its target, which enables it to detect supernova burst neutrinos with large statistics from the next galactic core-collapse supernova (CCSN), as well as pre-supernova neutrinos from nearby CCSN progenitors. All flavors of supernova burst neutrinos can be detected by JUNO via several interaction channels, including inverse beta decay, elastic scattering on electrons and protons, interactions on 12C nuclei, etc. This retains the possibility for JUNO to reconstruct the energy spectra of supernova burst neutrinos of all flavors. Real-time monitoring systems based on FPGA and DAQ are under development at JUNO; they allow prompt alerts and trigger-less data acquisition of CCSN events. The alert performance of both monitoring systems has been thoroughly studied using simulations. Moreover, once a CCSN is tagged, the system can give fast characterizations, such as directionality and light curve.

    Detection of the Diffuse Supernova Neutrino Background with JUNO

    As an underground multi-purpose neutrino detector with 20 kton of liquid scintillator, the Jiangmen Underground Neutrino Observatory (JUNO) is competitive with and complementary to the water-Cherenkov detectors in the search for the diffuse supernova neutrino background (DSNB). Typical supernova models predict 2-4 events per year within the optimal observation window in the JUNO detector. The dominant background is from the neutral-current (NC) interaction of atmospheric neutrinos with 12C nuclei, which surpasses the DSNB by more than one order of magnitude. We evaluated the systematic uncertainty of the NC background from the spread of a variety of data-driven models and further developed a method to determine the NC background within 15\% with \textit{in situ} measurements after ten years of running. In addition, the NC-like backgrounds can be effectively suppressed by the intrinsic pulse-shape discrimination (PSD) capabilities of liquid scintillators. In this talk, I will present in detail the improvements in NC background uncertainty evaluation, the PSD discriminator development, and finally, the potential DSNB sensitivity of JUNO.

    Solving Math Word Problem via Cooperative Reasoning induced Language Models

    Large-scale pre-trained language models (PLMs) bring new opportunities to challenging problems, especially those that need high-level intelligence, such as math word problems (MWPs). However, directly applying existing PLMs to MWPs can fail because the generation process lacks sufficient supervision and thus lacks the fast adaptivity of humans. We notice that human reasoning has a dual framework consisting of an immediate reaction system (system 1) and a delicate reasoning system (system 2), where the entire reasoning is determined by their interaction. This inspires us to develop a cooperative reasoning-induced PLM for solving MWPs, called Cooperative Reasoning (CoRe), resulting in a human-like reasoning architecture with system 1 as the generator and system 2 as the verifier. In our approach, the generator is responsible for generating reasoning paths, and the verifiers are used to supervise the evaluation in order to obtain reliable feedback for the generator. We evaluate our CoRe framework on several mathematical reasoning datasets and achieve decent improvement over state-of-the-art methods, with up to a 9.8% increase over the best baselines. Comment: The experimental results are not sufficient

    Freestanding layer-structure selenium cathodes with ultrahigh Se loading for high areal capacity Li-Se batteries

    In this work, a freestanding layer-structure Se cathode, composed of alternating barrier layers and active layers, is synthesized via a facile syringe-filtration strategy. With the barrier layers acting simultaneously as polyselenide-interception barriers and as highly conductive current collectors, the active layers with a 3D porous architecture are endowed with strong polyselenide-trapping capability and fast redox-conversion capability. This novel Se cathode readily achieves a high Se loading of up to 13.5 mg cm−2. The optimized Se cathode with a high Se loading of 4.5 mg cm−2 exhibits high specific capacity/areal capacity (794 mA h g−1/3.6 mA h cm−2), excellent cycling stability (508 mA h g−1@300 cycles), and remarkable rate capability (389 mA h g−1@800 mA g−1), which are superior to those of conventional Se cathodes. Keywords: Lithium-selenium batteries, Se cathode, Alternant layer structure, Syringe-filtration method, Areal capacity

    Supercritical carbon dioxide technology in synthesis, modification, and recycling of battery materials

    In pursuit of the ambitious goals of the burgeoning electric vehicle, portable electronic device, and energy storage sectors, Li‐ion batteries (LIBs) are considered one of the most promising electrochemical power sources because of their high energy density and moderate cost. In particular, the improvement of battery materials and the recycling of spent LIBs are receiving great attention, since sustainable approaches for the synthesis, modification, and recycling of battery materials are crucial to the successful large‐scale implementation of LIBs. In this regard, supercritical carbon dioxide (SC‐CO2), which possesses many merits, such as environmental friendliness, low cost, an individual chemical environment, and especially its unique physical properties, has been employed as a solvent and reaction medium in the synthesis and modification of diverse functional materials. In this review, we mainly aim to compile the applications of SC‐CO2 technology in the synthesis and modification of electrode materials as well as the recycling of LIBs. First, the unique properties and principles of SC‐CO2 technology are highlighted. Second, the latest progress in electrode material design and recycling with the assistance of the SC‐CO2 technique is summarized. Finally, the challenges, future directions, and perspectives on the design and development of battery materials and battery recycling by SC‐CO2 technology are proposed.

    Confining Sulfur in N‑Doped Porous Carbon Microspheres Derived from Microalgaes for Advanced Lithium–Sulfur Batteries

    The lithium–sulfur (Li–S) battery is one of the most attractive candidates for next-generation energy storage systems. However, the intrinsic insulating nature of sulfur and the notorious polysulfide shuttle are the major obstacles hindering the commercial application of Li–S batteries. Confining sulfur in conductive porous carbon matrices with designed polarized surfaces is regarded as a promising and effective strategy to overcome the above issues. Herein, we propose to use microalgae (Schizochytrium sp.) as low-cost, renewable carbon/nitrogen precursors and biological templates to synthesize N-doped porous carbon microspheres (NPCMs). These rationally designed NPCMs not only give the sulfur-loaded NPCM (NPCSM) composites high electronic conductivity and sulfur content, but also greatly suppress the diffusion of polysulfides through strong physical and chemical adsorption. As a result, the NPCSM cathode demonstrates a superior reversible capacity (1030.7 mA h g−1) and remarkable capacity retention (91%) at 0.1 A g−1 after 100 cycles. Even at an extremely high current density of 5 A g−1, NPCSMs can still deliver a satisfactory discharge capacity of 692.3 mA h g−1. This work reveals a sustainable and effective biosynthetic strategy to fabricate N-doped porous carbon matrices for high-performance sulfur cathodes in Li–S batteries, and offers a fascinating possibility to rationally design and synthesize novel carbon-based composites.